Semi-supervised Variable Weighting for Clustering
Authors
Abstract
Semi-supervised learning, which trains on a small amount of labeled data together with a large amount of unlabeled data, has attracted considerable research attention because it can substantially improve learning accuracy. In this work, we focus on semi-supervised variable weighting for clustering. Variable weighting is a critical step in clustering, because interesting cluster structure usually occurs in a subspace defined by a subset of the variables. Our method exploits both labeled and unlabeled data to identify the real importance of each variable. It also embeds variable weighting in the semi-supervised clustering process itself, rather than computing variable weights in a separate step, to keep the computation efficient. Experiments on both synthetic and real data demonstrate that semi-supervised variable weighting significantly improves clustering accuracy over existing semi-supervised k-means without variable weighting and over semi-supervised k-means with unsupervised variable weighting.
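The abstract does not spell out the update rules, so the following is only a minimal sketch of what embedding variable weighting inside semi-supervised k-means could look like, not the authors' algorithm. It assumes seeded initialization (centroids start at the mean of each cluster's labeled points, and labeled points keep their class) combined with a W-k-means-style weight update, in which a variable's weight is inversely related to its within-cluster dispersion. The function name `seeded_weighted_kmeans`, the exponent `beta`, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np

def seeded_weighted_kmeans(X, y, k, beta=2.0, n_iter=20, eps=1e-12):
    """Illustrative sketch. X: (n, d) data; y: (n,) int labels in [0, k),
    with -1 marking unlabeled points. Requires beta > 1."""
    n, d = X.shape
    labeled = y >= 0
    # Seed each centroid with the mean of its labeled points
    # (assumes every cluster has at least one labeled point).
    centroids = np.stack([X[y == c].mean(axis=0) for c in range(k)])
    weights = np.full(d, 1.0 / d)  # start with uniform variable weights
    for _ in range(n_iter):
        # Weighted squared distance from every point to every centroid.
        diff2 = (X[:, None, :] - centroids[None, :, :]) ** 2
        dist = (diff2 * weights ** beta).sum(axis=2)
        assign = dist.argmin(axis=1)
        assign[labeled] = y[labeled]  # labeled points keep their class
        # Recompute centroids; an empty cluster keeps its old centroid.
        for c in range(k):
            members = assign == c
            if members.any():
                centroids[c] = X[members].mean(axis=0)
        # Weight update is part of the same loop: variables with low
        # within-cluster dispersion receive high weight.
        disp = ((X - centroids[assign]) ** 2).sum(axis=0) + eps
        inv = disp ** (-1.0 / (beta - 1.0))
        weights = inv / inv.sum()
    return assign, centroids, weights

# Toy data: variable 0 separates two clusters, variable 1 is pure noise.
rng = np.random.default_rng(0)
informative = np.concatenate([rng.normal(0, 0.1, 50), rng.normal(5, 0.1, 50)])
noise = rng.uniform(-1, 1, 100)
X = np.column_stack([informative, noise])
y = np.full(100, -1)
y[:3] = 0      # three labeled points in the first cluster
y[50:53] = 1   # three labeled points in the second cluster
assign, centroids, weights = seeded_weighted_kmeans(X, y, k=2)
print(weights)
```

In this sketch the weights are recomputed inside the same loop that updates assignments and centroids, which mirrors the abstract's point about avoiding a separate weighting pass. On the toy data the informative variable ends up with the larger weight.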
Similar resources
Extracting Prior Knowledge from Data Distribution to Migrate from Blind to Semi-Supervised Clustering
Although many studies have sought to improve clustering efficiency, most state-of-the-art schemes lack robustness and stability. This paper proposes an efficient approach that elicits prior knowledge, in the form of must-link and cannot-link constraints, from the estimated distribution of the raw data in order to convert a blind clustering problem into a semi-supervised o...
Wised Semi-Supervised Cluster Ensemble Selection: A New Framework for Selecting and Combing Multiple Partitions Based on Prior knowledge
The Wisdom of Crowds, an innovative theory from social science, claims that the aggregate decisions made by a group will often be better than those of its individual members if the theory's four fundamental criteria are satisfied. This theory has been applied to clustering problems. Previous research showed that this theory can significantly increase the stability and performance of...
Semi-Supervised Fuzzy Clustering with Feature Discrimination
Semi-supervised clustering algorithms are increasingly employed for discovering hidden structure in data with partially labelled patterns. In order to make the clustering approach useful and acceptable to users, the information provided must be simple, natural and limited in number. To improve recognition capability, we apply an effective feature enhancement procedure to the entire data-set to ...
Semi-supervised Clustering Ensemble by Voting
Clustering ensemble is one of the most recent advances in unsupervised learning. It aims to combine the clustering results obtained using different algorithms, or from different runs of the same clustering algorithm, on the same data set. This is accomplished using a consensus function, and the efficiency and accuracy of this method have been demonstrated in many works in the literature. In the first part...